Abstract. In this paper we propose an approach for constructing partitionings
of hard variants of the Boolean satisfiability problem (SAT). Such partitionings
can be used to solve the corresponding SAT instances in parallel. For the same
SAT instance one can construct different partitionings, each of which is a set of
simplified versions of the original SAT instance. The effectiveness of a
partitioning is determined by the total time needed to solve all SAT instances in
it. We suggest an approach, based on the Monte Carlo method, for estimating
the time required to process an arbitrary partitioning. With each partitioning we
associate a point in a special finite search space. The estimate of the
effectiveness of a particular partitioning is the value of a predictive function at
the corresponding point of this space. The search for an effective partitioning can
thus be formulated as the problem of optimizing the predictive function. We use
metaheuristic algorithms (simulated annealing and tabu search) to move from point
to point in the search space. In our computational experiments we found
partitionings for SAT instances encoding inversion problems for some cryptographic
functions. Several of these SAT instances with realistic predicted solving time
were successfully solved on a computing cluster and in the volunteer computing
project SAT@home. The solving time agrees well with the estimates obtained by
the proposed method.


1 Introduction 

The Boolean satisfiability problem (SAT) is the following: for an arbitrary
Boolean formula (a formula of the propositional calculus), decide whether it is
satisfiable, i.e. whether there exists an assignment of the Boolean variables of the
formula that makes the formula true. The satisfiability problem for a Boolean formula
can be effectively (in polynomial time) reduced to the satisfiability problem for a
formula in conjunctive normal form (CNF). Hereinafter by a SAT instance we mean the
satisfiability problem for some CNF.

Although SAT is NP-complete (NP-hard as a search problem), it is very
important because of its wide spectrum of practical applications. Many combinatorial
problems from different areas can be effectively reduced to SAT. Over the last 10 years
impressive progress has been achieved in the effectiveness of SAT solving algorithms.
While these algorithms are exponential in the worst case, they display
high effectiveness on various classes of industrial problems. At present,
SAT solving algorithms are often used in formal verification, combinatorics,
cryptanalysis, bioinformatics and other areas.
Because of the high computational complexity of SAT, the development of methods
for solving hard SAT instances in parallel is considered to be relevant. Nowadays
the most popular approaches to parallel SAT solving are the portfolio approach and the
partitioning approach. The former means that one SAT instance is solved using different
SAT solvers or by the same SAT solver with different settings. Roughly speaking, in
the portfolio approach several copies of the SAT solver process the same search space
in different directions. During their work they can share information in the form of
conflict clauses, and in some cases this makes it possible to increase the solving speed.
The partitioning approach implies that the original SAT instance is decomposed into a
family of subproblems and this family is then processed in a parallel or in a distributed
computing environment. This family is in fact a partitioning of the original SAT
instance. The ability to process different subproblems independently makes it possible to
employ systems with thousands of computing nodes for solving the original problem.
Such an approach makes it possible to solve even some cryptanalysis problems in SAT form.
However, for the same SAT instance one can construct different partitionings. In this
context the question arises: if we have two partitionings, how can we know if one is
better than the other? Or, from the practical point of view, how can we find,
if not the best partitioning, then at least one for which the time required to
process all the subproblems in it is more or less realistic? In the present paper we
study these two problems.

2 Monte Carlo Approach to Statistical Estimation of Effectiveness 
of SAT Partitioning 

Let us consider SAT for an arbitrary CNF $C$. A partitioning of $C$ is a set of
formulas

    $C \wedge G_j, \quad j \in \{1, \ldots, s\}$

such that for any $i, j$ with $i \neq j$ the formula $C \wedge G_i \wedge G_j$ is unsatisfiable and

    $C \equiv (C \wedge G_1) \vee \ldots \vee (C \wedge G_s)$

(where "$\equiv$" stands for logical equivalence). It is obvious that when one has a
partitioning of the original SAT instance, the satisfiability problems for
$C \wedge G_j$, $j \in \{1, \ldots, s\}$, can be solved independently in parallel.

There exist various partitioning techniques. For example, one can construct
$\{G_j\}_{j=1}^{s}$ using a scattering procedure, a guiding path solver, a lookahead
solver and a number of other techniques described in the literature. Unfortunately, for
these partitioning methods it is hard in the general case to estimate the time required
to solve the original problem. On the other hand, a number of papers on logical
cryptanalysis of several keystream ciphers used a partitioning method that makes it
possible to construct such estimates in quite a natural way. In particular,
[5,18,19,21] used for this purpose information about the time needed to solve a small
number of subproblems randomly chosen from the partitioning of the original problem.
In this paper we give a strict formal description of this idea within the framework of
the Monte Carlo method in its classical form. We also focus on some important details
of the method that were not considered in previous works.
Consider the satisfiability problem for an arbitrary CNF $C$ over a set of Boolean
variables $X = \{x_1, \ldots, x_n\}$. We call an arbitrary set
$\tilde{X} = \{x_{i_1}, \ldots, x_{i_d}\}$, $\tilde{X} \subseteq X$, a
decomposition set. Consider a partitioning of $C$ that consists of the set of $2^d$ formulas

    $C \wedge G_j, \quad j \in \{1, \ldots, 2^d\},$

where $G_j$, $j \in \{1, \ldots, 2^d\}$, are all possible minterms over $\tilde{X}$. Note
that an arbitrary formula $G_j$ takes the value true on a single truth assignment
$(\alpha_1^j, \ldots, \alpha_d^j) \in \{0,1\}^d$. Therefore, an arbitrary formula
$C \wedge G_j$ is satisfiable if and only if
$C[\tilde{X}/(\alpha_1^j, \ldots, \alpha_d^j)]$ is satisfiable. Here
$C[\tilde{X}/(\alpha_1^j, \ldots, \alpha_d^j)]$ is produced by setting the values of the
variables $x_{i_k}$ to the corresponding $\alpha_k^j$, $k \in \{1, \ldots, d\}$:
$x_{i_1} = \alpha_1^j, \ldots, x_{i_d} = \alpha_d^j$. The set of CNFs

    $\Delta_C(\tilde{X}) = \left\{ C[\tilde{X}/(\alpha_1^j, \ldots, \alpha_d^j)] \right\}_{(\alpha_1^j, \ldots, \alpha_d^j) \in \{0,1\}^d}$

is called the decomposition family produced by $\tilde{X}$. It is clear that the
decomposition family is a partitioning of the SAT instance $C$.
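The construction of a decomposition family can be illustrated with a small sketch. It assumes a CNF stored as a list of clauses with DIMACS-style integer literals; the helper names `substitute` and `decomposition_family` and the toy CNF are hypothetical and are not taken from the paper or from any solver.

```python
from itertools import product


def substitute(cnf, assignment):
    """Return C[X~/alpha]: drop satisfied clauses, delete falsified literals.

    cnf is a list of clauses, each a list of non-zero ints (DIMACS style);
    assignment maps a variable index to True/False.
    """
    result = []
    for clause in cnf:
        reduced = []
        satisfied = False
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                if assignment[var] == (lit > 0):
                    satisfied = True
                    break
            else:
                reduced.append(lit)
        if not satisfied:
            result.append(reduced)  # an empty clause signals a conflict
    return result


def decomposition_family(cnf, decomposition_set):
    """Yield (alpha, C[X~/alpha]) for all 2^d assignments of the set."""
    variables = sorted(decomposition_set)
    for bits in product([False, True], repeat=len(variables)):
        alpha = dict(zip(variables, bits))
        yield alpha, substitute(cnf, alpha)


# Toy CNF (an assumption): (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
toy_cnf = [[1, 2], [-1, 3], [-2, -3]]
family = list(decomposition_family(toy_cnf, {1, 2}))
```

With the decomposition set {x1, x2} the family has four members, one per minterm, and each member can be handed to a solver independently of the others.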

Consider some algorithm $A$ for solving SAT. In the remainder of the paper we presume
that $A$ is complete, i.e. its runtime is finite for an arbitrary input. We also presume
that $A$ is a non-randomized deterministic algorithm. We denote the total amount of time
required for $A$ to solve all the SAT instances from $\Delta_C(\tilde{X})$ as
$t_{C,A}(\tilde{X})$. Below we concentrate mainly on the problem of estimating
$t_{C,A}(\tilde{X})$.

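Define the uniform distribution on the set $\{0,1\}^d$. With each randomly chosen truth
assignment $(\alpha_1, \ldots, \alpha_d)$ from $\{0,1\}^d$ we associate a value
$\xi_{C,A}(\alpha_1, \ldots, \alpha_d)$ that is equal to the time required for the
algorithm $A$ to solve SAT for $C[\tilde{X}/(\alpha_1, \ldots, \alpha_d)]$.
Let $\xi^1, \ldots, \xi^Q$ be all the different values that
$\xi_{C,A}(\alpha_1, \ldots, \alpha_d)$ takes on all the possible
$(\alpha_1, \ldots, \alpha_d) \in \{0,1\}^d$. Below we use the following notation:

    $\xi_{C,A}(\tilde{X}) = \{\xi^1, \ldots, \xi^Q\}.$    (1)

Denote the number of assignments $(\alpha_1, \ldots, \alpha_d)$ such that
$\xi_{C,A}(\alpha_1, \ldots, \alpha_d) = \xi^j$ as $\#\xi^j$. Associate with (1) the
following set:

    $P(\xi_{C,A}(\tilde{X})) = \left\{ \frac{\#\xi^1}{2^d}, \ldots, \frac{\#\xi^Q}{2^d} \right\}.$

We say that the random variable $\xi_{C,A}(\tilde{X})$ has distribution
$P(\xi_{C,A}(\tilde{X}))$. Note that the following equality holds:

    $t_{C,A}(\tilde{X}) = \sum_{k=1}^{Q} \xi^k \cdot \#\xi^k = 2^d \cdot \sum_{k=1}^{Q} \xi^k \cdot \frac{\#\xi^k}{2^d}.$

Therefore,

    $t_{C,A}(\tilde{X}) = 2^d \cdot \mathrm{E}[\xi_{C,A}(\tilde{X})].$    (2)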
To estimate the expected value $\mathrm{E}[\xi_{C,A}(\tilde{X})]$ we use the Monte Carlo
method. According to this method, a probabilistic experiment that consists of $N$
independent observations of values of an arbitrary random variable $\xi$ is used to
approximately calculate $\mathrm{E}[\xi]$. Let $\zeta^1, \ldots, \zeta^N$ be the results
of the corresponding observations. They can be considered as a single observation of $N$
independent random variables with the same distribution as $\xi$. If $\mathrm{E}[\xi]$
and $\mathrm{Var}(\xi)$ are both finite, then from the Central Limit Theorem we have the
main formula of the Monte Carlo method:

    $\Pr\left\{ \left| \frac{1}{N} \sum_{j=1}^{N} \zeta^j - \mathrm{E}[\xi] \right| < \frac{\delta_\gamma \cdot \sigma}{\sqrt{N}} \right\} = \gamma.$    (3)

Here $\sigma = \sqrt{\mathrm{Var}(\xi)}$ stands for the standard deviation, $\gamma$ for
the confidence level, and $\delta_\gamma$ is determined by $\gamma$ through $\Phi(\cdot)$,
the normal cumulative distribution function. It means that under the considered
assumptions the value

    $\frac{1}{N} \sum_{j=1}^{N} \zeta^j$

is a good approximation of $\mathrm{E}[\xi]$ when the number of observations $N$ is large
enough.

In our case, from the assumption regarding the completeness of the algorithm $A$ it
follows that the random variable $\xi_{C,A}(\tilde{X})$ has finite expected value and
finite variance. We would like to mention that the algorithm $A$ should not use
randomization, since if it does then the observed values in the general case will not
have the same distribution. The fact that $N$ can be significantly less than $2^d$ makes
it possible to use a preprocessing stage to estimate the effectiveness of the considered
partitioning.
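As a rough illustration of formula (3), the sketch below computes a sample mean together with the half-width of its confidence interval. Two assumptions are made that are not stated in the paper: the unknown standard deviation is replaced by the sample standard deviation, and $\delta_\gamma$ is taken as the $(1+\gamma)/2$ quantile of the standard normal distribution (the usual two-sided choice). The function name `monte_carlo_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def monte_carlo_interval(observations, gamma=0.95):
    """Return (sample mean, half-width delta * sigma / sqrt(N))."""
    n = len(observations)
    estimate = mean(observations)
    # Assumption: the sample standard deviation stands in for the true sigma.
    sigma = stdev(observations)
    # Assumption: two-sided interval, so delta is the (1 + gamma) / 2 quantile.
    delta = NormalDist().inv_cdf((1 + gamma) / 2)
    return estimate, delta * sigma / sqrt(n)
```

The half-width shrinks proportionally to $1/\sqrt{N}$, which is why a sample much smaller than $2^d$ can already give a usable estimate when the runtimes are not too widely spread.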

So the process of estimating the value (2) for a given $\tilde{X}$ is as follows. We
randomly choose $N$ truth assignments of variables from $\tilde{X}$:

    $\alpha^1 = (\alpha_1^1, \ldots, \alpha_d^1), \ \ldots, \ \alpha^N = (\alpha_1^N, \ldots, \alpha_d^N).$    (4)

Below we refer to (4) as a random sample. Then we consider the values

    $\zeta^j = \xi_{C,A}(\alpha^j), \quad j = 1, \ldots, N,$

and calculate the value

    $F_{C,A}(\tilde{X}) = 2^d \cdot \left( \frac{1}{N} \sum_{j=1}^{N} \zeta^j \right).$    (5)

By the above, if $N$ is large enough then the value of $F_{C,A}(\tilde{X})$ can be
considered as a good approximation of (2). Therefore, instead of searching for a
decomposition set with minimal value (2) one can search for a decomposition set with
minimal value of $F_{C,A}(\cdot)$. Below we refer to the function $F_{C,A}(\cdot)$ as the
predictive function.
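The computation of the predictive function (5) can be sketched as follows. The cost function here is a hypothetical stand-in for the measured runtime of the solver on one subproblem; in a real experiment it would run the algorithm $A$ on $C[\tilde{X}/\alpha]$ and return the elapsed time. The names `predictive_function` and `toy_cost` are illustrative only.

```python
import random


def predictive_function(cost, d, n_samples, rng):
    """Estimate 2^d * E[cost] from n_samples uniformly random assignments."""
    total = 0.0
    for _ in range(n_samples):
        alpha = tuple(rng.randint(0, 1) for _ in range(d))
        total += cost(alpha)
    return (2 ** d) * total / n_samples


def toy_cost(alpha):
    # Hypothetical stand-in for a solver runtime, NOT a real measurement.
    return 1.0 + sum(alpha)


estimate = predictive_function(toy_cost, d=10, n_samples=2000, rng=random.Random(0))
```

For this toy cost the exact total over all $2^{10}$ assignments is $1024 \cdot (1 + 10/2) = 6144$, so the estimate can be compared against a known value; with real solver runtimes no such closed form is available, which is the reason for sampling.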
3 Algorithms for Minimization of Predictive Function


As we already noted above, different partitionings of the same SAT instance can have
different values of $t_{C,A}(\tilde{X})$. In practice it is important to be able to find
partitionings that can be processed in realistic time. Below we describe a scheme of
automatic search for good partitionings that is based on a procedure minimizing the
predictive function value in a special search space.

So we consider the satisfiability problem for some CNF $C$. Let
$X = \{x_1, \ldots, x_n\}$ be the set of all Boolean variables in this CNF and
$\tilde{X} \subseteq X$ be an arbitrary decomposition set. The set $\tilde{X}$ can be
represented by the binary vector $\chi = (\chi_1, \ldots, \chi_n)$. Here

    $\chi_i = 1$ if $x_i \in \tilde{X}$, and $\chi_i = 0$ if $x_i \notin \tilde{X}$.

With an arbitrary vector $\chi \in \{0,1\}^n$ we associate the value of the function
$F(\chi)$ computed in the following manner. For the vector $\chi$ we construct the
corresponding set $\tilde{X}$ (it is formed by the variables from $X$ that correspond to
the 1 positions in $\chi$). Then we generate a random sample
$\alpha^1, \ldots, \alpha^N$, $\alpha^j \in \{0,1\}^{|\tilde{X}|}$ (see (4)), and solve
SAT for the CNFs $C[\tilde{X}/\alpha^j]$. For each of these SAT instances we measure
$\zeta^j$, the runtime of the algorithm $A$ on the input $C[\tilde{X}/\alpha^j]$. After
this we calculate the value of $F_{C,A}(\tilde{X})$ according to (5). As a result we have
the value of $F(\chi)$ at the considered point of the search space.

Now we will solve the problem $F(\chi) \to \min$ over the set $\{0,1\}^n$. Of course, the
problem of searching for the exact minimum of the function $F(\chi)$ is extraordinarily
complex. Therefore our main goal is to find in affordable time points in $\{0,1\}^n$ with
relatively good values of the function $F(\cdot)$. Note that the function $F(\cdot)$ is
not specified by a formula and therefore we do not know any of its analytical properties.
That is why we use metaheuristic algorithms to minimize this function: simulated
annealing and tabu search.

First we need to introduce the notation. By $\Re$ we denote the search space, for
example, $\Re = \{0,1\}^n$; however, as we will see later, for the problems considered one
can use search spaces of much smaller cardinality. The minimization of the function
$F(\cdot)$ is considered as an iterative process of transitioning between the points of
the search space:

    $\chi^0 \to \chi^1 \to \ldots \to \chi^i \to \ldots \to \chi^*.$

By $N_\rho(\chi)$ we denote the neighborhood of the point $\chi$ of radius $\rho$ in the
search space $\Re$. The point from which the search starts is denoted by $\chi_{start}$.
We refer to the decomposition set specified by this point as $\tilde{X}_{start}$. The
current Best Known Value of $F(\cdot)$ is denoted by $F_{best}$. The point at which
$F_{best}$ was achieved is denoted by $\chi_{best}$. By $\chi_{center}$ we denote the
point whose neighborhood is processed at the current moment. We call a point at which we
have computed the value $F(\cdot)$ a checked point. A neighborhood $N_\rho(\chi)$ in
which all the points are checked is called a checked neighborhood. Otherwise the
neighborhood is called unchecked.
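The notions of a point, its decomposition set and its neighborhood can be sketched in a few lines. The paper does not spell out the metric, so the sketch assumes that the radius is measured as Hamming distance between binary vectors; the function names are hypothetical.

```python
from itertools import combinations


def neighborhood(chi, rho):
    """All points at Hamming distance 1..rho from chi (chi itself excluded)."""
    n = len(chi)
    points = []
    for r in range(1, rho + 1):
        for positions in combinations(range(n), r):
            neighbor = list(chi)
            for p in positions:
                neighbor[p] ^= 1  # flip the selected coordinate
            points.append(tuple(neighbor))
    return points


def to_decomposition_set(chi):
    """Variables x_i (1-based indices) whose position in chi holds a 1."""
    return {i + 1 for i, bit in enumerate(chi) if bit}
```

Under this assumption a radius-1 neighborhood of a point in $\{0,1\}^n$ contains $n$ points, each obtained by adding one variable to the decomposition set or removing one from it.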

According to the scheme of simulated annealing, the transition from $\chi^i$ to
$\chi^{i+1}$ is performed in two stages. First we choose a point $\tilde{\chi}^i$ from
$N_\rho(\chi^i)$. The point $\tilde{\chi}^i$ becomes the point $\chi^{i+1}$ with the
probability denoted as $\Pr\{\tilde{\chi}^i \to \chi^{i+1} \mid \chi^i\}$. This
probability is defined in the following way:

    $\Pr\{\tilde{\chi}^i \to \chi^{i+1} \mid \chi^i\} = 1$, if $F(\tilde{\chi}^i) < F(\chi^i)$;
    $\Pr\{\tilde{\chi}^i \to \chi^{i+1} \mid \chi^i\} = \exp\left(-\frac{F(\tilde{\chi}^i) - F(\chi^i)}{T_i}\right)$, if $F(\tilde{\chi}^i) \geq F(\chi^i)$.

In the pseudocode of the algorithm shown below, the function that tests whether the
point $\tilde{\chi}^i$ becomes $\chi^{i+1}$ is called PointAccepted (this function returns
true if the transition occurs and false otherwise). The change of the parameter $T_i$
corresponds to decreasing the "temperature of the environment" (in the pseudocode
decreaseTemperature() denotes the function which implements this procedure). Usually it is
assumed that $T_i = Q \cdot T_{i-1}$, $i \geq 1$, where $Q \in (0,1)$. The process
starts at some initial value $T_0$ and continues until the temperature drops below some
threshold value $T_{inf}$ (in the pseudocode the function that checks this condition is
called temperatureLimitReached()).


Algorithm 1: Simulated annealing algorithm for minimization of the predictive
function

Input: CNF C, initial point chi_start
Output: Pair (chi_best, F_best), where F_best is a prediction for C, chi_best is a
        corresponding decomposition set

    (chi_center, F_best) <- (chi_start, F(chi_start))
    repeat
        bestValueUpdated <- false
        rho = 1
        repeat                                        // check neighborhood
            chi <- any unchecked point from N_rho(chi_center)
            compute F(chi)
            mark chi as checked point in N_rho(chi_center)
            if PointAccepted(chi) then
                (chi_best, F_best) <- (chi, F(chi))
                chi_center <- chi_best
                bestValueUpdated <- true
            if (N_rho(chi_center) is checked) and (not bestValueUpdated) then
                rho = rho + 1
            decreaseTemperature()
        until bestValueUpdated
    until timeExceeded() or temperatureLimitReached()
    return (chi_best, F_best)
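The acceptance rule and the geometric cooling schedule can be sketched in Python. This is a generic simulated annealing loop, not a transcription of Algorithm 1: it picks a random radius-1 neighbor instead of tracking checked neighborhoods, it does not enlarge the radius, and it remembers the best point separately from the current one. The names and parameter values are assumptions made for illustration.

```python
import math
import random


def point_accepted(f_new, f_cur, temperature, rng):
    """Accept improvements always, other moves with probability exp(-dF / T)."""
    if f_new < f_cur:
        return True
    return rng.random() < math.exp(-(f_new - f_cur) / temperature)


def simulated_annealing(f, start, t0, q, t_inf, rng):
    """Minimize f over bit tuples with cooling T_i = q * T_{i-1}."""
    current, f_cur = start, f(start)
    best, f_best = current, f_cur
    temperature = t0
    while temperature > t_inf:
        i = rng.randrange(len(current))
        # Flip one coordinate: a random point of the radius-1 neighborhood.
        candidate = current[:i] + (1 - current[i],) + current[i + 1:]
        f_new = f(candidate)
        if point_accepted(f_new, f_cur, temperature, rng):
            current, f_cur = candidate, f_new
            if f_cur < f_best:
                best, f_best = current, f_cur
        temperature *= q
    return best, f_best
```

In the setting of the paper each call of `f` would correspond to one evaluation of the predictive function, i.e. to solving a whole random sample of subproblems, so the number of loop iterations that can be afforded is small compared with textbook uses of simulated annealing.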


For the minimization of $F(\cdot)$ we also employed the tabu search scheme. According
to this approach, the points of the search space at which we have already calculated
the values of the function $F(\cdot)$ are stored in special tabu lists. When we try to
improve the current Best Known Value of $F(\cdot)$ in the neighborhood of some point
$\chi_{center}$, then for an arbitrary point $\chi$ from the neighborhood we first check
whether we have computed $F(\chi)$ earlier. If we have not and, therefore, the point
$\chi$ is not contained in the tabu lists, then we compute $F(\chi)$. This strategy is
justified in the case of the minimization of the predictive function $F(\cdot)$ because
computing the values of the function at some points of the search space is very
expensive. The use of tabu lists makes it possible to significantly increase the number
of points of the search space processed per time unit.

Let us describe the tabu search algorithm for the minimization of $F(\cdot)$ in more
detail. To store the information about the points at which we have already computed the
value of $F(\cdot)$ we use two tabu lists $L_1$ and $L_2$. The $L_1$ list contains only
points with checked neighborhoods. The $L_2$ list contains checked points with unchecked
neighborhoods. Below we present the pseudocode of the tabu search algorithm for the
minimization of $F(\cdot)$.


Algorithm 2: Tabu search algorithm for minimization of the predictive function

Input: CNF C, initial point chi_start
Output: Pair (chi_best, F_best), where F_best is a prediction for C, chi_best is a
        corresponding decomposition set

    (chi_center, F_best) <- (chi_start, F(chi_start))
    (L_1, L_2) <- (empty, chi_start)                  // initialize tabu lists
    repeat
        bestValueUpdated <- false
        repeat                                        // check neighborhood
            chi <- any unchecked point from N_rho(chi_center)
            compute F(chi)
            markPointInTabuLists(chi, L_1, L_2)       // update tabu lists
            if F(chi) < F_best then
                (chi_best, F_best) <- (chi, F(chi))
                bestValueUpdated <- true
        until N_rho(chi_center) is checked
        if bestValueUpdated then chi_center <- chi_best
        else chi_center <- getNewCenter(L_2)
    until timeExceeded() or L_2 is empty
    return (chi_best, F_best)

In this algorithm the function markPointInTabuLists($\chi$, $L_1$, $L_2$) adds the point
$\chi$ to $L_2$ and then marks $\chi$ as checked in all neighborhoods of points from
$L_2$ that contain $\chi$. If as a result the neighborhood of some point $\chi'$ becomes
checked, the point $\chi'$ is removed from $L_2$ and added to $L_1$. If we have processed
all the points in the neighborhood of $\chi_{center}$ but could not improve $F_{best}$,
then as the new point $\chi_{center}$ we choose some point from $L_2$. This is done via
the function getNewCenter($L_2$). To choose the new point in this case one can use
various heuristics. At the moment the tabu search algorithm employs the following
heuristic: it chooses the point for which the total conflict activity of the Boolean
variables contained in the corresponding decomposition set is the largest.
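A simplified version of this scheme can be sketched in Python. The sketch departs from Algorithm 2 in two labeled ways: only the current center is moved from the open list to the closed list (the cross-neighborhood bookkeeping of markPointInTabuLists is omitted), and the new center is chosen as the open point with the lowest known value of the function rather than by conflict activity, which would require solver statistics. All names are hypothetical.

```python
def tabu_search(f, start, max_evaluations):
    """Simplified tabu-style search over bit tuples, radius-1 neighborhoods."""
    values = {start: f(start)}   # every checked point with its F value
    fully_checked = set()        # plays the role of L_1
    open_points = {start}        # plays the role of L_2
    center = best = start
    while open_points and len(values) < max_evaluations:
        improved = False
        for i in range(len(center)):
            point = center[:i] + (1 - center[i],) + center[i + 1:]
            if point in values:  # tabu: F(point) is already known
                continue
            values[point] = f(point)
            open_points.add(point)
            if values[point] < values[best]:
                best, improved = point, True
        open_points.discard(center)  # the neighborhood of center is checked
        fully_checked.add(center)
        if improved:
            center = best
        elif open_points:
            # Assumed heuristic: continue from the open point with lowest F.
            center = min(open_points, key=values.get)
    return best, values[best]
```

The dictionary `values` is what makes the scheme economical: a point reached again from a different center is never re-evaluated, which matters when a single evaluation means solving a random sample of SAT instances.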
As we already mentioned above, taking into account the features of the considered
SAT problems makes it possible to significantly decrease the size of the search space.
For example, knowing the so-called Backdoor Sets can help in that matter. Let us
consider a SAT instance that encodes the inversion problem for a function $f$ that maps
Boolean vectors of one fixed length to Boolean vectors of another fixed length. Let
$S(f)$ be the Boolean circuit implementing $f$. Then the set $\tilde{X}_{in}$, formed by
the variables encoding the inputs of the Boolean circuit $S(f)$, is a so-called Strong
Unit Propagation Backdoor Set. It means that if we use $\tilde{X}_{in}$ as the
decomposition set, then a CDCL (Conflict-Driven Clause Learning) solver will solve SAT
for any CNF of the kind $C[\tilde{X}_{in}/\alpha]$,
$\alpha \in \{0,1\}^{|\tilde{X}_{in}|}$, at the preprocessing stage, i.e. very fast.
Therefore the set $\tilde{X}_{in}$ can be used as the set $\tilde{X}_{start}$ in the
predictive function minimization procedure. Moreover, in this case it is possible to use
the set $2^{\tilde{X}_{in}}$ of all subsets of $\tilde{X}_{in}$ in the role of the search
space $\Re$. In all our computational experiments we followed this path.
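The backdoor property can be illustrated on a one-gate circuit. The sketch below is a plain unit propagation loop over a DIMACS-style clause list, written for illustration only; it is not the propagation routine of any particular CDCL solver, and the Tseitin encoding of a single AND gate is a toy assumption.

```python
def unit_propagate(cnf, assignment):
    """Extend assignment by unit propagation; return None on a conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in cnf:
            unassigned = []
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var in assignment:
                    if assignment[var] == (lit > 0):
                        satisfied = True
                        break
                else:
                    unassigned.append(lit)
            if satisfied:
                continue
            if not unassigned:
                return None  # every literal is false: conflict
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0  # forced value
                changed = True
    return assignment


# Tseitin encoding of the one-gate circuit x3 = x1 AND x2 (toy example).
and_gate = [[-1, -2, 3], [1, -3], [2, -3]]
# Inversion instance: additionally require the output x3 to be true.
inversion = and_gate + [[3]]
```

Once both inputs are fixed, propagation alone either derives the value of every remaining variable or hits a conflict, so no search is needed; this is the behavior the text attributes to the set of input variables.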


4 Computational Experiments 

The algorithms presented in the previous section were implemented as the MPI program
PDSAT¹. In PDSAT there is one leader process; all the others are computing processes
(each process corresponds to 1 CPU core).

The leader process selects points of the search space (we use neighborhoods of
radius $\rho = 1$). For every new point $\chi$, which specifies a decomposition set
$\tilde{X}$, the leader process creates a random sample (4) of size $N$. Each assignment
from (4), in combination with the original CNF $C$, defines a SAT instance from the
decomposition family $\Delta_C(\tilde{X})$. These SAT instances are solved by the
computing processes. The value of the predictive function is always computed assuming
that the decomposition family will be processed by 1 CPU core. The fact that the
processing of $\Delta_C(\tilde{X})$ consists in solving independent subproblems makes it
possible to extrapolate the estimation obtained to an arbitrary parallel (or distributed)
computing system. The computing processes use the MiniSat solver². This solver was
modified to be able to stop computations upon receiving non-blocking messages from
the leader process.

Below we present the results of computational experiments in which PDSAT was
used to estimate the time required to solve problems of logical cryptanalysis of the
A5/1, Bivium and Grain keystream generators. The SAT instances that encode
these problems were produced using the Transalg system.

4.1 Time Estimations for Logical Cryptanalysis of A5/1 

We first considered the logical cryptanalysis of the A5/1 keystream generator in an
earlier paper, which describes the corresponding algorithm in detail; we therefore do
not repeat it here.

¹ https://github.com/Nauchnik/pdsat

² http://minisat.se
We considered the cryptanalysis problem for the A5/1 keystream generator in the
following form: given 114 bits of keystream, find the secret key of length 64 bits
which produces this keystream (in accordance with the A5/1 algorithm).
The PDSAT program was used to find partitionings with good time estimations for
CNFs encoding this problem. The computational experiments were performed on the
computing cluster "Academician V.M. Matrosov"³. One computing node of this cluster
consists of 2 AMD Opteron 6276 CPUs (32 CPU cores in total). In each experiment
PDSAT was launched for 1 day using 2 computing nodes (i.e. 64 CPU cores). We used
random samples of size N = 10^. 

Figures 1, 2a and 2b show three decomposition sets. We described the first
decomposition set (further referred to as $S_1$) in an earlier paper. This set
(consisting of 31 variables) was constructed "manually" based on the analysis of
algorithmic features of the A5/1 generator. The second one ($S_2$), consisting of 31
variables, was found by minimizing $F(\cdot)$ with the simulated annealing algorithm
(see Section 3). The third decomposition set ($S_3$), consisting of 32 variables, was
found by minimizing $F(\cdot)$ with the tabu search algorithm. Table 1 shows the values
of $F(\cdot)$ (in seconds) for all three decomposition sets. Note that each of the
decomposition sets $S_2$ and $S_3$ was found for one 114-bit fragment of keystream that
was generated according to the A5/1 algorithm for a randomly chosen 64-bit secret key.
Since the estimations obtained turned out to be realistic, we decided to solve
non-weakened cryptanalysis instances for A5/1. For this purpose we used the
volunteer computing project SAT@home.



4.2 Solving Cryptanalysis Instances for A5/1 

Volunteer computing is a type of distributed computing which uses the computational
resources of PCs belonging to private persons called volunteers. Each volunteer
computing project is designed to solve one or several hard problems. SAT@home⁴ is a
BOINC-based

³ http://hpc.icc.ru
⁴ http://sat.isa.ru/pdsat/
Fig. 2: Decomposition sets found by PDSAT for cryptanalysis of A5/1. (a) $S_2$: found by
simulated annealing


Table 1: Decomposition sets for logical cryptanalysis of A5/1 and corresponding values

    Set    Power of set    F(·)
    S_1    31              4.45140e+08
    S_2    31              4.78318e+08
    S_3    32              4.64428e+08


volunteer computing project aimed at solving hard combinatorial problems that can
be effectively reduced to SAT. It was launched on September 29, 2011 by ISDCT SB
RAS and IITP RAS. On February 7, 2012 SAT@home was added to the official list of
BOINC projects⁵.

An experiment aimed at solving 10 cryptanalysis instances for the A5/1 keystream
generator was held in SAT@home from December 2011 to May 2012. To construct the
corresponding tests we used the known rainbow tables for the A5/1 algorithm⁶. These
tables provide about 88% probability of success when analyzing 8 bursts of keystream
(i.e. 912 bits). We randomly generated 1000 instances and applied the rainbow-table
technique to analyze 8 bursts of keystream generated by A5/1. For 125 of these 1000
instances the rainbow tables could not find the secret key. From these
125 instances we randomly chose 10 and, in the computational experiments, applied the
SAT approach to the analysis of the first bursts of the corresponding keystream
fragments (114 bits each). For each SAT instance we constructed the partitioning
generated by the decomposition set $S_1$ (see Figure 1) and processed it in the SAT@home
project. All 10 instances constructed this way were successfully solved in SAT@home
(i.e. we managed to find the corresponding secret keys) in about 5 months (the average
performance of the project at that time was about 2 teraflops). The second experiment on
the cryptanalysis of A5/1 was launched in SAT@home in May 2014. Its purpose was to
test the decomposition set found by the tabu search algorithm. In particular we took the

⁵ http://boinc.berkeley.edu/projects.php

⁶ https://opensource.srlabs.de/projects/a51-decrypt
decomposition set $S_3$ (see Figure 2b). On September 26, 2014 we successfully solved
in SAT@home all 10 instances from the considered series.

It should be noted that in all the experiments the time required to solve the problem
agrees with the predictive function value computed for the decomposition sets $S_1$ and
$S_3$. Our computational experiments demonstrate that the proposed method of
automatic search for decomposition sets makes it possible to construct SAT
partitionings with properties close to those of "reference" partitionings, i.e.
partitionings constructed based on the analysis of algorithmic features of the
considered cryptographic functions.


4.3 Time Estimations for Logical Cryptanalysis of Bivium and Grain


The Bivium keystream generator uses two shift registers of a special kind. The first
register contains 93 cells and the second contains 84 cells. To initialize the cipher, a
secret key of length 80 bits is put into the first register, and a fixed (known)
initialization vector (IV) of length 80 bits is put into the second register. All
remaining cells are filled with zeros. The initialization phase consists of 708 rounds
during which keystream output is not released.

The Grain keystream generator [13] also uses 2 shift registers: the first is an 80-bit
nonlinear feedback shift register (NFSR), the second is an 80-bit linear feedback shift
register (LFSR). To mix the register outputs the cipher uses a special filter function
$h(x)$. To initialize the cipher, an 80-bit secret key is put into the NFSR and a fixed
(known) 64-bit initialization vector is put into the LFSR. All remaining cells are filled
with ones. Then the cipher works in a special mode for 160 rounds. It does not release
keystream output during initialization.

In accordance with 0131181 we considered cryptanalysis problems for Bivium and 
Grain in the following formulation. Based on the known fragment of keystream we 
search for the values of all registers cells at the end of the initialization phase. It means 
that we need to find 177 bits in case of Bivium and 160 bits in case of Grain. Therefore, 
in our experiments we used CNF encodings where the initialization phase was omitted. 

It is usually believed that to uniquely identify the secret key it is sufficient to
consider a keystream fragment of length comparable to the total length of the shift
registers. Here we followed [5,18] and set the keystream fragment length for Bivium
cryptanalysis to 200 bits and for Grain cryptanalysis to 160 bits.

In our computational experiments we applied PDSAT to SAT instances that encode
the cryptanalysis of Bivium and Grain according to the formulation described above.

In these experiments we used only the tabu search algorithm to minimize the predictive
functions, since compared to simulated annealing it traverses more points of
the search space per time unit. We also noticed that the decomposition set for the A5/1
cryptanalysis constructed by the tabu search algorithm is closer to the "reference" set
than the one constructed with the help of simulated annealing.

In the role of $\tilde{X}_{start}$ for the cryptanalysis of Bivium and Grain we chose
the set formed by the variables encoding the cells of the registers of the generator
considered at the end of the initialization phase. Further we refer to these variables as
starting variables. Therefore $|\tilde{X}_{start}| = 177$ in the case of Bivium, and
$|\tilde{X}_{start}| = 160$ in the case of Grain.


In each predictive function minimization experiment PDSAT used random samples of
size N = 10^ SAT instances and worked for 1 day using 5 computing nodes (160
CPU cores in total) within the computing cluster "Academician V.M. Matrosov". So
there was 1 leader process and 159 computing processes. The time estimations obtained
are $F_{best}$ = 3.769 x 10^° for Bivium and $F_{best}$ = 4.368 x 10^° seconds for Grain.
The corresponding decomposition set $\tilde{X}_{best}$ for Bivium is marked in gray in
Figure 3 (50 variables), and the decomposition set for Grain is marked in gray in
Figure 4 (69 variables). An interesting fact is that $\tilde{X}_{best}$ for Grain
contains only variables corresponding to the LFSR cells.









Fig. 3: Decomposition set of 50 variables found by PDSAT for Bivium cryptanalysis



Fig. 4: Decomposition set of 69 variables found by PDSAT for Grain cryptanalysis

In [5,18,19] a number of time estimations for logical cryptanalysis of Bivium were
proposed. In particular, in [5] several fixed types of decomposition sets (strategies in
the notation of [5]) were analyzed. The best decomposition set from [5] consists of
45 variables encoding the last 45 cells of the second shift register. Note that in [5] the
corresponding estimation of time equal to 1.637 x 10^^ was calculated using random
samples of size 10^. In [18,19] estimations of the runtime of the CryptoMiniSat SAT
solver on SAT instances encoding Bivium cryptanalysis were proposed.
From the description of the experiments in these papers it can be seen that the authors
used the Monte Carlo method to estimate the sets of variables chosen by CryptoMiniSat
during the solving process and extrapolated the estimations obtained to time points of
the solving process that lie in the distant future. Apparently, as described in [18,19],
random samples of size 10^ and 10^ were used. Table 2 shows all three estimations
mentioned above. The performance of one core of the processor we
used in our experiments is comparable with that of one core of the processor used in
those works.


Table 2: Time estimations for the Bivium cryptanalysis problem

    Source            N      Time estimation
    From [5]          10^    1.637 x 10^
    From [18,19]      T(F    9.718 x 10^
    Found by PDSAT    TcP    3.769 x 10^


4.4 Solving Weakened Cryptanalysis Instances for Bivium and Grain 

For solving weakened cryptanalysis instances for Bivium and Grain we used the com¬ 
puting cluster (by running PD SAT in the solving mode) and the volunteer computing 
project SAT@home. 

In the solving mode of PD SAT for X^^st found during predictive function mini¬ 
mization all assignments of variables from X^^st are generated. PD SAT solves 

all corresponding SAT instances. To compare obtained time estimations with real solv¬ 
ing time we used PD SAT to solve several weakened cryptanalysis problems for Bivium 
and Grain. Below we use the notation BiviumK (GminK) to denote a weakened prob¬ 
lem for Bivium (Grain) with known values of K starting variables encoding the last K 
cells of the second shift register. We solved 3 instances for each of weakened problems: 
Biviuml6, Biviuml4, Biviuml2, Grain44, Grain42 and Grain40. 

In the following experiments, for each weakened problem we computed the estimation 
for the first instance from the corresponding series and used the obtained decomposition 
set for all 3 instances from the series. To get more statistical data we did not stop 
the solving process after a satisfying assignment was found, thus processing the whole 
decomposition family. Table 3 shows, for each weakened problem, the time 
required to solve it using 15 computing nodes (480 CPU cores in total) of the “Academician 
V.M. Matrosov” cluster. The estimation of time was computed for instance 1 in all cases. 
The estimation for 480 CPU cores is based on the estimation for 1 CPU core. According 
to the results from this table, on average the real solving time deviates from the 
estimation by about 8%. 
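
The comparison behind this figure can be reproduced roughly as follows. The sketch assumes that the 480-core estimation is the 1-core value of F_best divided by 480 (ideal linear scaling) and that the deviation is measured relative to the estimation; neither convention is spelled out in the text. The numbers are the 1-core estimations and the 480-core solving times from Table 3.

```python
CORES = 480

# (problem, F_best on 1 core, solving times of inst. 1-3 on 480 cores), from Table 3
TABLE3 = [
    ("Bivium16", 1.65e7, [3.42e4, 3.36e4, 3.42e4]),
    ("Bivium14", 6.84e7, [1.34e5, 1.32e5, 1.33e5]),
    ("Bivium12", 2.63e8, [4.95e5, 4.83e5, 5.28e5]),
    ("Grain44", 1.60e7, [3.61e4, 4.51e4, 3.73e4]),
    ("Grain42", 6.05e7, [1.35e5, 1.30e5, 1.20e5]),
    ("Grain40", 2.52e8, [5.79e5, 5.73e5, 5.06e5]),
]

def relative_deviations(table, cores=CORES):
    """Return |real - estimate| / estimate for every solved instance."""
    deviations = []
    for _name, f_one_core, times in table:
        estimate = f_one_core / cores  # assumption: ideal linear scaling
        deviations.extend(abs(t - estimate) / estimate for t in times)
    return deviations

deviations = relative_deviations(TABLE3)
mean_deviation = sum(deviations) / len(deviations)
print(f"mean relative deviation: {mean_deviation:.1%}")
```

Under these assumptions the mean over all 18 runs comes out close to the 8% quoted above; the largest single deviation belongs to the second Grain44 instance.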

We also solved the Bivium9 problem in the volunteer computing project SAT@home. 
With the help of PDSAT a decomposition set of 43 variables was found. Using 
this decomposition set, 5 instances of Bivium9 were solved in SAT@home in about 
4 months, from September 2014 to December 2014. During this experiment the average 
performance of the project was about 4 teraflops. 

It should be noted that for all considered BiviumK and GrainK problems the time required 
to solve the corresponding instances on the computing cluster and in SAT@home 
agrees well with the values of the predictive function found by our approach. 

                           F_best            Δ_C(X_best) on 480 cores    Finding SAT on 480 cores 
Problem   |X_best|   1 core   480 cores   inst. 1   inst. 2   inst. 3   inst. 1   inst. 2   inst. 3 
Bivium16     31      1.65e7    3.44e4     3.42e4    3.36e4    3.42e4    1.10e3    2.33e4    2.67e4 
Bivium14     35      6.84e7    1.42e5     1.34e5    1.32e5    1.33e5    3.95e2    9.10e4    9.18e4 
Bivium12     37      2.63e8    5.50e5     4.95e5    4.83e5    5.28e5    3.04e5    1.39e5    1.89e5 
Grain44      29      1.60e7    3.36e4     3.61e4    4.51e4    3.73e4    1.34e3    1.35e4    8.24e2 
Grain42      29      6.05e7    1.26e5     1.35e5    1.30e5    1.20e5    6.92e4    1.07e5    9.15e4 
Grain40      32      2.52e8    5.27e5     5.79e5    5.73e5    5.06e5    3.10e5    5.10e5    3.20e5 

Table 3: Solving weakened cryptanalysis problems for Bivium and Grain 


5 Related Work 

Some problems regarding the construction of SAT partitionings were studied in [9]. In 
the papers [5,18,19] the cryptanalysis of the Bivium cipher was considered as a SAT 
problem. The approach used in these papers is close to the one proposed by us. In 
particular, the effectiveness of a SAT partitioning was estimated based on the average 
solving time of SAT instances randomly chosen from the corresponding partitioning. 
However, there was no justification of this approach from the point of view of the 
Monte Carlo method (in its classical sense). Also, these papers did not introduce the 
concept of the predictive function and did not consider the problem of search for 
effective partitionings as a problem of optimization of a predictive function. 
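
The sampling idea discussed in this paragraph can be illustrated with a minimal sketch. It assumes that the total processing time of a partitioning over d variables is estimated as 2^d times the mean solving time of randomly chosen sub-instances; the names `estimate_total_time` and `toy_solving_time` are hypothetical, and the toy cost function merely stands in for a real SAT solver run.

```python
import random
import statistics

def estimate_total_time(d, sample_times):
    """Estimate the time to process all 2^d sub-instances from a random sample."""
    return (2 ** d) * statistics.fmean(sample_times)

def toy_solving_time(assignment):
    """Hypothetical stand-in for a solver run: cost grows with the number of ones."""
    return 1.0 + sum(assignment)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
d, n = 10, 1000
sample = [tuple(rng.randrange(2) for _ in range(d)) for _ in range(n)]
estimate = estimate_total_time(d, [toy_solving_time(a) for a in sample])

# For this toy cost the exact total is known in closed form: 2^d * (1 + d/2).
exact = (2 ** d) * (1.0 + d / 2)
print(estimate, exact)
```

For the toy cost the exact total is available, so the quality of the sample-based estimate can be checked directly; for real SAT instances only the estimate is affordable.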

The most effective method of cryptanalysis of A5/1 in practice is the Rainbow 
method, a partial description of which can be found on the A5/1 Cracking Project site¹. 
In [?] a number of techniques used in the A5/1 Cracking Project to construct Rainbow 
tables were presented. The cryptanalysis of A5/1 via Rainbow tables has a success 
rate of approximately 88% if one uses 8 bursts of keystream. The success rate of the 
Rainbow method if one has only 1 burst of keystream is about 24%. In all our computational 
experiments we analyzed a keystream fragment of size 114 bits, i.e. one 
burst. In [17] we described our first experience of applying the SAT approach 
to A5/1 cryptanalysis in the specially constructed grid system BNB-Grid. In that paper 
we found the S_1 set (see Section 4.1) manually, based on the peculiarities of the A5/1 
algorithm. 


Acknowledgements The authors wish to thank Stepan Kochemazov for numerous 
valuable comments. This work was partly supported by Russian Foundation for Basic 
Research (grants 14-07-00403-a and 15-07-07891-a). 


References 

1. Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.): Handbook of Satisfiability, Frontiers 
in Artificial Intelligence and Applications, vol. 185. IOS Press (2009) 

¹ https://opensource.srlabs.de/projects/a51-decrypt 

2. Biryukov, A., Shamir, A., Wagner, D.: Real Time Cryptanalysis of A5/1 on a PC. In: 
Schneier, B. (ed.) FSE. LNCS, vol. 1978, pp. 1-18. Springer (2000) 

3. Cannière, C.D.: Trivium: A stream cipher construction inspired by block cipher design 
principles. In: Katsikas, S.K., López, J., Backes, M., Gritzalis, S., Preneel, B. (eds.) ISC. LNCS, 
vol. 4176, pp. 171-186. Springer (2006) 

4. Durrani, M.N., Shamsi, J.A.: Volunteer computing: requirements, challenges, and solutions. 
J. Network and Computer Applications 39, 369-380 (2014) 

5. Eibach, T., Pilz, E., Völkel, G.: Attacking Bivium Using SAT Solvers. In: Büning, H.K., 
Zhao, X. (eds.) SAT. LNCS, vol. 4996, pp. 63-76. Springer (2008) 

6. Glover, F., Laguna, M.: Tabu Search. Kluwer Academic Publishers (1997) 

7. Güneysu, T., Kasper, T., Novotný, M., Paar, C., Rupp, A.: Cryptanalysis with COPACOBANA. 
IEEE Trans. Comput. 57(11), 1498-1513 (Nov 2008) 

8. Hell, M., Johansson, T., Meier, W.: Grain: a stream cipher for constrained environments. 
IJWMC 2(1), 86-93 (2007) 

9. Hyvärinen, A.E.J.: Grid Based Propositional Satisfiability Solving. Ph.D. thesis, Aalto 
University (2011) 

10. Järvisalo, M., Junttila, T.A.: Limitations of restricted branching in clause learning. 
Constraints 14(3), 325-356 (2009) 

11. Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 
220(4598), 671-680 (1983) 

12. Marques-Silva, J., Lynce, I., Malik, S.: Conflict-Driven Clause Learning SAT Solvers. In: 
Biere et al. [1], pp. 131-153 

13. Maximov, A., Biryukov, A.: Two trivial attacks on Trivium. In: Adams, C.M., Miri, A., 
Wiener, M.J. (eds.) Selected Areas in Cryptography. LNCS, 
vol. 4876, pp. 36-55. Springer (2007) 

14. Metropolis, N., Ulam, S.: The Monte Carlo Method. J. Amer. Statistical Assoc. 44(247), 
335-341 (1949) 

15. Otpuschennikov, I., Semenov, A., Kochemazov, S.: Transalg: a tool for translating procedural 
descriptions of discrete functions to SAT (tool paper). CoRR abs/1405.1544 (2014) 

16. Posypkin, M., Semenov, A., Zaikin, O.: Using BOINC desktop grid to solve large scale SAT 
problems. Computer Science Journal 13(1), 25-34 (2012) 

17. Semenov, A., Zaikin, O., Bespalov, D., Posypkin, M.: Parallel Logical Cryptanalysis of the 
Generator A5/1 in BNB-Grid System. In: Malyshkin, V. (ed.) PaCT. LNCS, vol. 6873, pp. 
473-483. Springer (2011) 

18. Soos, M.: Grain of Salt - an Automated Way to Test Stream Ciphers through SAT Solvers. 
In: Tools'10: Proceedings of the Workshop on Tools for Cryptanalysis, pp. 131-144 (2010) 

19. Soos, M., Nohl, K., Castelluccia, C.: Extending SAT Solvers to Cryptographic Problems. In: 
Kullmann, O. (ed.) SAT. LNCS, vol. 5584, pp. 244-257. Springer (2009) 

20. Williams, R., Gomes, C.P., Selman, B.: Backdoors to typical case complexity. In: Gottlob, 
G., Walsh, T. (eds.) IJCAI. pp. 1173-1178. Morgan Kaufmann (2003) 

21. Zaikin, O., Semenov, A.: Large-block parallelism technology in SAT problems (in Russian). 
Control Sciences 1, 43-50 (2008) 


